Spectro-temporal modulation based singing detection combined with pitch-based grouping for singing voice separation
Authors
Abstract
This paper proposes a spectro-temporal modulation based singing voice detector cascaded with a Viterbi-based pitch tracking algorithm for singing voice separation from monaural recordings. To detect the singing voice, the spectro-temporal modulation energy related to voice harmonics is extracted using a spectro-temporal modulation analysis framework developed for the Fourier spectrogram. The singing voice is then separated from the background music by a binary mask that groups the estimated harmonics of the singing voice. The proposed system is evaluated on the MIR-1K dataset and is shown to outperform three other binary-mask based systems in the vocal/music separation task.
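The harmonic grouping step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `harmonic_binary_mask`, the fixed tolerance in Hz, and the number of harmonics are all assumptions chosen for illustration, and the pitch track and voicing decisions are taken as given inputs (in the paper they would come from the pitch tracker and the singing voice detector).

```python
import numpy as np

def harmonic_binary_mask(f0_hz, voiced, freqs_hz, n_harmonics=10, tol_hz=40.0):
    """Build a (n_bins, n_frames) boolean mask that keeps bins near harmonics of f0.

    f0_hz    : per-frame pitch estimate in Hz (hypothetical input)
    voiced   : per-frame flag, True where singing voice was detected
    freqs_hz : centre frequency of each spectrogram bin in Hz
    n_harmonics, tol_hz : illustrative assumptions, not values from the paper
    """
    f0_hz = np.asarray(f0_hz, dtype=float)
    voiced = np.asarray(voiced, dtype=bool)
    freqs_hz = np.asarray(freqs_hz, dtype=float)

    mask = np.zeros((freqs_hz.size, f0_hz.size), dtype=bool)
    # Only frames flagged as vocal and carrying a valid pitch get a non-empty mask.
    for t in np.flatnonzero(voiced & (f0_hz > 0)):
        harmonics = f0_hz[t] * np.arange(1, n_harmonics + 1)
        # Distance from every bin to its nearest harmonic of the sung pitch.
        dist = np.abs(freqs_hz[:, None] - harmonics[None, :]).min(axis=1)
        mask[:, t] = dist <= tol_hz
    return mask

# Usage: 513 bins spanning 0 to 8 kHz, three frames, the first one unvoiced.
freqs = np.linspace(0.0, 8000.0, 513)
mask = harmonic_binary_mask([0.0, 200.0, 220.0], [False, True, True], freqs,
                            n_harmonics=5, tol_hz=20.0)
```

Multiplying such a mask element-wise with the mixture spectrogram would keep only the time-frequency units assigned to the voice; the complementary mask would give the accompaniment estimate.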
Similar resources
Singing Voice Separation from Monaural Recordings
Separating singing voice from music accompaniment has wide applications in areas such as automatic lyrics recognition and alignment, singer identification, and music information retrieval. Compared to the extensive studies of speech separation, singing voice separation has been little explored. We propose a system to separate singing voice from music accompaniment from monaural recordings. The ...
A two-stage singing voice separation algorithm using spectro-temporal modulation features
A two-stage singing voice separation algorithm using spectro-temporal modulation features is proposed in this paper. First, music clips are transformed into auditory spectrograms, and the spectro-temporal modulation contents of all time-frequency (T-F) units of the auditory spectrograms are extracted using an auditory model. Then, T-F units are sequentially clustered using the expectation-maximi...
Separation and Classification of Harmonic Sounds for Singing Voice Detection
This paper presents a novel method for the automatic detection of singing voice in polyphonic music recordings that involves the extraction of harmonic sounds from the audio mixture and their classification. After being separated, sounds can be better characterized by computing features that are otherwise obscured in the mixture. A set of descriptors of typical pitch fluctuations of the singin...
Pitch Estimation of Singing Voice From Monaural Popular Music Recordings
Singing voice separation is a hard yet popular task in the field of music information retrieval (MIR). Once the voice is successfully separated, a number of algorithms can be applied to the vocal melody for further applications. In this study, we applied a pitch estimation algorithm after separating a singing voice from background music based on the implementation of REPET [1]. Then we evaluated our ...
Singing Voice Separation Using Spectro-Temporal Modulation Features
An auditory-perception inspired singing voice separation algorithm for monaural music recordings is proposed in this paper. Under the framework of computational auditory scene analysis (CASA), the music recordings are first transformed into auditory spectrograms. After extracting the spectro-temporal modulation contents of the time-frequency (T-F) units through a two-stage auditory model, we de...